Non-linear least squares is the form of least squares analysis used to fit a set of ''m'' observations with a model that is non-linear in ''n'' unknown parameters (''m'' ≥ ''n''). It is used in some forms of non-linear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences.

== Theory ==
Consider a set of ''m'' data points, <math>(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m),</math> and a curve (model function) <math>\hat{y} = f(x, \boldsymbol\beta),</math> that in addition to the variable <math>x</math> also depends on ''n'' parameters, <math>\boldsymbol\beta = (\beta_1, \beta_2, \dots, \beta_n),</math> with <math>m \ge n.</math> It is desired to find the vector <math>\boldsymbol\beta</math> of parameters such that the curve best fits the given data in the least squares sense, that is, the sum of squares
: <math>S = \sum_{i=1}^m r_i^2</math>
is minimized, where the residuals (errors) ''r''<sub>''i''</sub> are given by
: <math>r_i = y_i - f(x_i, \boldsymbol\beta)</math>
for <math>i = 1, 2, \dots, m.</math>

The minimum value of ''S'' occurs when the gradient is zero. Since the model contains ''n'' parameters there are ''n'' gradient equations:
: <math>\frac{\partial S}{\partial \beta_j} = 2 \sum_{i=1}^m r_i \frac{\partial r_i}{\partial \beta_j} = 0 \qquad (j = 1, \dots, n).</math>
In a non-linear system, the derivatives <math>\tfrac{\partial r_i}{\partial \beta_j}</math> are functions of both the independent variable and the parameters, so these gradient equations do not have a closed solution. Instead, initial values must be chosen for the parameters. Then, the parameters are refined iteratively, that is, the values are obtained by successive approximation,
: <math>\beta_j \approx \beta_j^{k+1} = \beta_j^{k} + \Delta\beta_j.</math>
Here, ''k'' is an iteration number and the vector of increments, <math>\Delta\boldsymbol\beta,</math> is known as the shift vector. At each iteration the model is linearized by approximation to a first-order Taylor series expansion about <math>\boldsymbol\beta^{k}</math>:
: <math>f(x_i, \boldsymbol\beta) \approx f(x_i, \boldsymbol\beta^{k}) + \sum_{j=1}^n \frac{\partial f(x_i, \boldsymbol\beta^{k})}{\partial \beta_j} \left(\beta_j - \beta_j^{k}\right) = f(x_i, \boldsymbol\beta^{k}) + \sum_{j=1}^n J_{ij}\, \Delta\beta_j.</math>
The Jacobian, '''J''', is a function of constants, the independent variable ''and'' the parameters, so it changes from one iteration to the next. Thus, in terms of the linearized model,
: <math>\frac{\partial r_i}{\partial \beta_j} = -J_{ij},</math>
and the residuals are given by
: <math>r_i = \Delta y_i - \sum_{s=1}^n J_{is}\, \Delta\beta_s, \qquad \Delta y_i = y_i - f(x_i, \boldsymbol\beta^{k}).</math>
Substituting these expressions into the gradient equations, they become
: <math>-2 \sum_{i=1}^m J_{ij} \left(\Delta y_i - \sum_{s=1}^n J_{is}\, \Delta\beta_s\right) = 0,</math>
which, on rearrangement, become ''n'' simultaneous linear equations, the normal equations
: <math>\sum_{i=1}^m \sum_{s=1}^n J_{ij} J_{is}\, \Delta\beta_s = \sum_{i=1}^m J_{ij}\, \Delta y_i \qquad (j = 1, \dots, n).</math>
The normal equations are written in matrix notation as
: <math>\left(\mathbf{J}^\mathsf{T} \mathbf{J}\right) \Delta\boldsymbol\beta = \mathbf{J}^\mathsf{T}\, \Delta\mathbf{y}.</math>

When the observations are not equally reliable, a weighted sum of squares may be minimized,
: <math>S = \sum_{i=1}^m W_{ii}\, r_i^2.</math>
Each element of the diagonal weight matrix '''W''' should, ideally, be equal to the reciprocal of the error variance of the measurement.〔This implies that the observations are uncorrelated. If the observations are correlated, the expression
: <math>S = \sum_{i=1}^m \sum_{k=1}^m W_{ik}\, r_i r_k</math>
applies. In this case the weight matrix should ideally be equal to the inverse of the error variance-covariance matrix of the observations.〕 The normal equations are then
: <math>\left(\mathbf{J}^\mathsf{T} \mathbf{W} \mathbf{J}\right) \Delta\boldsymbol\beta = \mathbf{J}^\mathsf{T} \mathbf{W}\, \Delta\mathbf{y}.</math>
These equations form the basis for the Gauss–Newton algorithm for a non-linear least squares problem.
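The iteration described above can be sketched directly in code. The following is a minimal illustration, not part of the standard presentation of the method: it assumes NumPy, uses a hypothetical function name <code>gauss_newton</code>, and fits an exponential-decay model chosen only as an example. At each step it solves the unweighted normal equations <math>\left(\mathbf{J}^\mathsf{T} \mathbf{J}\right) \Delta\boldsymbol\beta = \mathbf{J}^\mathsf{T}\, \Delta\mathbf{y}</math> for the shift vector and updates the parameters.

<syntaxhighlight lang="python">
import numpy as np

def gauss_newton(f, jac, x, y, beta0, n_iter=20, tol=1e-10):
    """Minimal Gauss-Newton iteration (illustrative sketch, no safeguards).

    f(x, beta)   -> model values f(x_i, beta), shape (m,)
    jac(x, beta) -> Jacobian J_ij = d f(x_i, beta) / d beta_j, shape (m, n)
    beta0        -> initial parameter guess, shape (n,)
    """
    beta = np.asarray(beta0, dtype=float)
    for _ in range(n_iter):
        dy = y - f(x, beta)          # residuals Delta y_i = y_i - f(x_i, beta^k)
        J = jac(x, beta)             # Jacobian evaluated at the current parameters
        # Solve the normal equations (J^T J) dbeta = J^T dy for the shift vector
        dbeta = np.linalg.solve(J.T @ J, J.T @ dy)
        beta += dbeta
        if np.linalg.norm(dbeta) < tol:   # stop when the shift is negligible
            break
    return beta

# Example model (illustrative only): y = beta_1 * exp(-beta_2 * x)
f = lambda x, b: b[0] * np.exp(-b[1] * x)
jac = lambda x, b: np.column_stack([np.exp(-b[1] * x),
                                    -b[0] * x * np.exp(-b[1] * x)])

x = np.linspace(0.0, 4.0, 50)
rng = np.random.default_rng(0)
y = 2.5 * np.exp(-1.3 * x) + 0.01 * rng.standard_normal(x.size)

beta_hat = gauss_newton(f, jac, x, y, beta0=[2.0, 1.0])
</syntaxhighlight>

For unequally reliable observations the same step would instead solve the weighted normal equations, e.g. <code>np.linalg.solve(J.T @ W @ J, J.T @ W @ dy)</code> with a diagonal weight matrix '''W'''; a practical solver would also add damping and more careful convergence tests than this sketch.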